Overview

Dataset statistics

Number of variables: 21
Number of observations: 17908
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.9 MiB
Average record size in memory: 168.0 B

Variable types

Numeric: 17
Categorical: 4

Alerts

risk_score_4 is highly correlated with risk_score_5 (High correlation)
risk_score_5 is highly correlated with risk_score_4 (High correlation)
risk_score_3 is highly correlated with risk_score_5 (High correlation)
risk_score_5 is highly correlated with risk_score_3 and 1 other field (High correlation)
ext_quality_score is highly correlated with ext_quality_score_2 (High correlation)
ext_quality_score_2 is highly correlated with ext_quality_score (High correlation)
months_employed has 13201 (73.7%) zeros (Zeros)
years_employed has 694 (3.9%) zeros (Zeros)
current_address_year has 1486 (8.3%) zeros (Zeros)
personal_account_m has 263 (1.5%) zeros (Zeros)
personal_account_y has 398 (2.2%) zeros (Zeros)

Reproduction

Analysis started: 2022-12-13 14:36:59.620887
Analysis finished: 2022-12-13 14:37:57.576897
Duration: 57.96 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json

Variables

entry_id
Real number (ℝ≥0)

Distinct: 17888
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5596977.616
Minimum: 1111398
Maximum: 9999874
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 140.0 KiB

Quantile statistics

Minimum: 1111398
5-th percentile: 1571268.4
Q1: 3378998.75
median: 5608376
Q3: 7805624.25
95-th percentile: 9563219.15
Maximum: 9999874
Range: 8888476
Interquartile range (IQR): 4426625.5

Descriptive statistics

Standard deviation: 2562472.751
Coefficient of variation (CV): 0.4578315167
Kurtosis: -1.200819058
Mean: 5596977.616
Median Absolute Deviation (MAD): 2214168.5
Skewness: -0.01573015823
Sum: 1.002306752 × 10^11
Variance: 6.566266598 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                 Count   Frequency (%)
4455352               2       < 0.1%
8156839               2       < 0.1%
1540097               2       < 0.1%
5946902               2       < 0.1%
7845110               2       < 0.1%
9732127               2       < 0.1%
3903643               2       < 0.1%
9276506               2       < 0.1%
2352669               2       < 0.1%
6739041               2       < 0.1%
Other values (17878)  17888   99.9%

Value                 Count   Frequency (%)
1111398               1       < 0.1%
1111512               1       < 0.1%
1111600               1       < 0.1%
1112315               1       < 0.1%
1112537               1       < 0.1%
1112907               1       < 0.1%
1114070               1       < 0.1%
1114089               1       < 0.1%
1114268               1       < 0.1%
1114275               1       < 0.1%

Value                 Count   Frequency (%)
9999874               1       < 0.1%
9999421               1       < 0.1%
9998678               1       < 0.1%
9997871               1       < 0.1%
9997796               1       < 0.1%
9997128               1       < 0.1%
9997079               1       < 0.1%
9995957               1       < 0.1%
9995338               1       < 0.1%
9995118               1       < 0.1%

age
Real number (ℝ≥0)

Distinct72
Distinct (%)0.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean43.01541211
Minimum18
Maximum96
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size140.0 KiB

Quantile statistics

Minimum: 18
5-th percentile: 25
Q1: 34
median: 42
Q3: 51
95-th percentile: 63
Maximum: 96
Range: 78
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation11.87310691
Coefficient of variation (CV)0.276019834
Kurtosis-0.3603232559
Mean43.01541211
Median Absolute Deviation (MAD)9
Skewness0.3164827428
Sum770320
Variance140.9706677
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
37570
 
3.2%
43559
 
3.1%
38547
 
3.1%
39546
 
3.0%
44543
 
3.0%
34531
 
3.0%
42522
 
2.9%
40520
 
2.9%
45515
 
2.9%
46513
 
2.9%
Other values (62)12542
70.0%
ValueCountFrequency (%)
1831
 
0.2%
1947
 
0.3%
2073
 
0.4%
2177
 
0.4%
22143
0.8%
23185
1.0%
24225
1.3%
25241
1.3%
26291
1.6%
27335
1.9%
ValueCountFrequency (%)
961
 
< 0.1%
891
 
< 0.1%
871
 
< 0.1%
863
 
< 0.1%
853
 
< 0.1%
846
< 0.1%
832
 
< 0.1%
824
 
< 0.1%
813
 
< 0.1%
8014
0.1%

pay_schedule
Categorical

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size140.0 KiB
bi-weekly
10716 
weekly
3696 
semi-monthly
2004 
monthly
1492 

Length

Max length12
Median length9
Mean length8.549921823
Min length6

Characters and Unicode

Total characters: 153112
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowbi-weekly
2nd rowweekly
3rd rowweekly
4th rowbi-weekly
5th rowsemi-monthly

Common Values

ValueCountFrequency (%)
bi-weekly10716
59.8%
weekly3696
 
20.6%
semi-monthly2004
 
11.2%
monthly1492
 
8.3%

Length

Histogram of lengths of the category

Category Frequency Plot

Most occurring characters

ValueCountFrequency (%)
e30828
20.1%
y17908
11.7%
l17908
11.7%
k14412
9.4%
w14412
9.4%
-12720
8.3%
i12720
8.3%
b10716
 
7.0%
m5500
 
3.6%
h3496
 
2.3%
Other values (4)12492
8.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter140392
91.7%
Dash Punctuation12720
 
8.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e30828
22.0%
y17908
12.8%
l17908
12.8%
k14412
10.3%
w14412
10.3%
i12720
9.1%
b10716
 
7.6%
m5500
 
3.9%
h3496
 
2.5%
t3496
 
2.5%
Other values (3)8996
 
6.4%
Dash Punctuation
ValueCountFrequency (%)
-12720
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin140392
91.7%
Common12720
 
8.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e30828
22.0%
y17908
12.8%
l17908
12.8%
k14412
10.3%
w14412
10.3%
i12720
9.1%
b10716
 
7.6%
m5500
 
3.9%
h3496
 
2.5%
t3496
 
2.5%
Other values (3)8996
 
6.4%
Common
ValueCountFrequency (%)
-12720
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII153112
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e30828
20.1%
y17908
11.7%
l17908
11.7%
k14412
9.4%
w14412
9.4%
-12720
8.3%
i12720
8.3%
b10716
 
7.0%
m5500
 
3.6%
h3496
 
2.3%
Other values (4)12492
8.2%

home_owner
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size140.0 KiB
0
10294 
1
7614 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters: 17908
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row1
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
010294
57.5%
17614
42.5%

Length

Histogram of lengths of the category

Category Frequency Plot

Most occurring characters

ValueCountFrequency (%)
010294
57.5%
17614
42.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number17908
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
010294
57.5%
17614
42.5%

Most occurring scripts

ValueCountFrequency (%)
Common17908
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
010294
57.5%
17614
42.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII17908
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
010294
57.5%
17614
42.5%

income
Real number (ℝ≥0)

Distinct2284
Distinct (%)12.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3657.214653
Minimum905
Maximum9985
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size140.0 KiB

Quantile statistics

Minimum: 905
5-th percentile: 1705
Q1: 2580
median: 3260
Q3: 4670
95-th percentile: 6585
Maximum: 9985
Range: 9080
Interquartile range (IQR): 2090

Descriptive statistics

Standard deviation1504.890063
Coefficient of variation (CV)0.4114852986
Kurtosis0.8603130919
Mean3657.214653
Median Absolute Deviation (MAD)917.5
Skewness0.9702378559
Sum65493400
Variance2264694.103
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
327062
 
0.3%
325561
 
0.3%
305560
 
0.3%
326558
 
0.3%
321058
 
0.3%
314557
 
0.3%
316556
 
0.3%
309056
 
0.3%
326055
 
0.3%
329554
 
0.3%
Other values (2274)17331
96.8%
ValueCountFrequency (%)
9051
 
< 0.1%
10152
< 0.1%
10301
 
< 0.1%
10553
< 0.1%
10901
 
< 0.1%
10951
 
< 0.1%
11301
 
< 0.1%
11401
 
< 0.1%
11451
 
< 0.1%
11501
 
< 0.1%
ValueCountFrequency (%)
99851
< 0.1%
99701
< 0.1%
99251
< 0.1%
99151
< 0.1%
98851
< 0.1%
98701
< 0.1%
98341
< 0.1%
98131
< 0.1%
97551
< 0.1%
96601
< 0.1%

months_employed
Real number (ℝ≥0)

ZEROS

Distinct12
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1.186006254
Minimum0
Maximum11
Zeros13201
Zeros (%)73.7%
Negative0
Negative (%)0.0%
Memory size140.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 6
Maximum: 11
Range: 11
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation2.400896749
Coefficient of variation (CV)2.024354206
Kurtosis3.250647666
Mean1.186006254
Median Absolute Deviation (MAD)0
Skewness2.046884127
Sum21239
Variance5.764305197
MonotonicityNot monotonic
Histogram with fixed size bins (bins=12)
ValueCountFrequency (%)
013201
73.7%
5942
 
5.3%
1866
 
4.8%
6842
 
4.7%
2516
 
2.9%
3416
 
2.3%
4309
 
1.7%
9222
 
1.2%
10209
 
1.2%
7199
 
1.1%
Other values (2)186
 
1.0%
ValueCountFrequency (%)
013201
73.7%
1866
 
4.8%
2516
 
2.9%
3416
 
2.3%
4309
 
1.7%
5942
 
5.3%
6842
 
4.7%
7199
 
1.1%
8144
 
0.8%
9222
 
1.2%
ValueCountFrequency (%)
1142
 
0.2%
10209
 
1.2%
9222
 
1.2%
8144
 
0.8%
7199
 
1.1%
6842
4.7%
5942
5.3%
4309
 
1.7%
3416
2.3%
2516
2.9%

years_employed
Real number (ℝ≥0)

ZEROS

Distinct17
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.526859504
Minimum0
Maximum16
Zeros694
Zeros (%)3.9%
Negative0
Negative (%)0.0%
Memory size140.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
median: 3
Q3: 5
95-th percentile: 8
Maximum: 16
Range: 16
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation2.259731727
Coefficient of variation (CV)0.6407206537
Kurtosis0.8738668707
Mean3.526859504
Median Absolute Deviation (MAD)1
Skewness0.9127174816
Sum63159
Variance5.106387479
MonotonicityNot monotonic
Histogram with fixed size bins (bins=17)
ValueCountFrequency (%)
23856
21.5%
33526
19.7%
12420
13.5%
61990
11.1%
51956
10.9%
41919
10.7%
0694
 
3.9%
7589
 
3.3%
9376
 
2.1%
10287
 
1.6%
Other values (7)295
 
1.6%
ValueCountFrequency (%)
0694
 
3.9%
12420
13.5%
23856
21.5%
33526
19.7%
41919
10.7%
51956
10.9%
61990
11.1%
7589
 
3.3%
8216
 
1.2%
9376
 
2.1%
ValueCountFrequency (%)
162
 
< 0.1%
155
 
< 0.1%
146
 
< 0.1%
139
 
0.1%
1213
 
0.1%
1144
 
0.2%
10287
1.6%
9376
2.1%
8216
 
1.2%
7589
3.3%

current_address_year
Real number (ℝ≥0)

ZEROS

Distinct13
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.584710744
Minimum0
Maximum12
Zeros1486
Zeros (%)8.3%
Negative0
Negative (%)0.0%
Memory size140.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
median: 3
Q3: 5
95-th percentile: 9
Maximum: 12
Range: 12
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation2.751936664
Coefficient of variation (CV)0.7676872308
Kurtosis0.04040600415
Mean3.584710744
Median Absolute Deviation (MAD)2
Skewness0.9034759668
Sum64195
Variance7.573155403
MonotonicityNot monotonic
Histogram with fixed size bins (bins=13)
ValueCountFrequency (%)
23628
20.3%
12772
15.5%
32762
15.4%
41994
11.1%
01486
8.3%
51286
 
7.2%
61182
 
6.6%
9692
 
3.9%
8652
 
3.6%
7617
 
3.4%
Other values (3)837
 
4.7%
ValueCountFrequency (%)
01486
8.3%
12772
15.5%
23628
20.3%
32762
15.4%
41994
11.1%
51286
 
7.2%
61182
 
6.6%
7617
 
3.4%
8652
 
3.6%
9692
 
3.9%
ValueCountFrequency (%)
1211
 
0.1%
11228
 
1.3%
10598
 
3.3%
9692
 
3.9%
8652
 
3.6%
7617
 
3.4%
61182
6.6%
51286
7.2%
41994
11.1%
32762
15.4%

personal_account_m
Real number (ℝ≥0)

ZEROS

Distinct12
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.427183382
Minimum0
Maximum11
Zeros263
Zeros (%)1.5%
Negative0
Negative (%)0.0%
Memory size140.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
median: 2
Q3: 5
95-th percentile: 7
Maximum: 11
Range: 11
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation2.216440083
Coefficient of variation (CV)0.6467235149
Kurtosis-0.2342798516
Mean3.427183382
Median Absolute Deviation (MAD)1
Skewness0.7941275108
Sum61374
Variance4.912606642
MonotonicityNot monotonic
Histogram with fixed size bins (bins=12)
ValueCountFrequency (%)
26091
34.0%
12782
15.5%
62368
 
13.2%
31632
 
9.1%
41587
 
8.9%
51348
 
7.5%
71135
 
6.3%
9377
 
2.1%
0263
 
1.5%
8209
 
1.2%
Other values (2)116
 
0.6%
ValueCountFrequency (%)
0263
 
1.5%
12782
15.5%
26091
34.0%
31632
 
9.1%
41587
 
8.9%
51348
 
7.5%
62368
 
13.2%
71135
 
6.3%
8209
 
1.2%
9377
 
2.1%
ValueCountFrequency (%)
1148
 
0.3%
1068
 
0.4%
9377
 
2.1%
8209
 
1.2%
71135
 
6.3%
62368
 
13.2%
51348
 
7.5%
41587
 
8.9%
31632
 
9.1%
26091
34.0%

personal_account_y
Real number (ℝ≥0)

ZEROS

Distinct16
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.503350458
Minimum0
Maximum15
Zeros398
Zeros (%)2.2%
Negative0
Negative (%)0.0%
Memory size140.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
median: 3
Q3: 4
95-th percentile: 7
Maximum: 15
Range: 15
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation1.955568317
Coefficient of variation (CV)0.5581994553
Kurtosis1.761587292
Mean3.503350458
Median Absolute Deviation (MAD)1
Skewness1.164914873
Sum62738
Variance3.824247444
MonotonicityNot monotonic
Histogram with fixed size bins (bins=16)
ValueCountFrequency (%)
35490
30.7%
23801
21.2%
43400
19.0%
11295
 
7.2%
71125
 
6.3%
5824
 
4.6%
6721
 
4.0%
8584
 
3.3%
0398
 
2.2%
10113
 
0.6%
Other values (6)157
 
0.9%
ValueCountFrequency (%)
0398
 
2.2%
11295
 
7.2%
23801
21.2%
35490
30.7%
43400
19.0%
5824
 
4.6%
6721
 
4.0%
71125
 
6.3%
8584
 
3.3%
962
 
0.3%
ValueCountFrequency (%)
151
 
< 0.1%
146
 
< 0.1%
134
 
< 0.1%
1215
 
0.1%
1169
 
0.4%
10113
 
0.6%
962
 
0.3%
8584
3.3%
71125
6.3%
6721
4.0%

has_debt
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size140.0 KiB
1
14244 
0
3664 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters: 17908
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row1
2nd row1
3rd row1
4th row1
5th row1

Common Values

ValueCountFrequency (%)
114244
79.5%
03664
 
20.5%

Length

Histogram of lengths of the category

Category Frequency Plot

Most occurring characters

ValueCountFrequency (%)
114244
79.5%
03664
 
20.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number17908
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
114244
79.5%
03664
 
20.5%

Most occurring scripts

ValueCountFrequency (%)
Common17908
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
114244
79.5%
03664
 
20.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII17908
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
114244
79.5%
03664
 
20.5%

amount_requested
Real number (ℝ≥0)

Distinct98
Distinct (%)0.5%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean950.4464485
Minimum350
Maximum10200
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size140.0 KiB

Quantile statistics

Minimum: 350
5-th percentile: 400
Q1: 600
median: 700
Q3: 1100
95-th percentile: 2600
Maximum: 10200
Range: 9850
Interquartile range (IQR): 500

Descriptive statistics

Standard deviation698.5436832
Coefficient of variation (CV)0.7349637471
Kurtosis24.57411204
Mean950.4464485
Median Absolute Deviation (MAD)200
Skewness3.599579126
Sum17020595
Variance487963.2774
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
6002503
14.0%
7002317
12.9%
4002252
12.6%
8001715
9.6%
9001466
8.2%
5001357
 
7.6%
11001094
 
6.1%
1200896
 
5.0%
1000477
 
2.7%
550244
 
1.4%
Other values (88)3587
20.0%
ValueCountFrequency (%)
35046
 
0.3%
3751
 
< 0.1%
4002252
12.6%
40113
 
0.1%
42510
 
0.1%
450224
 
1.3%
4753
 
< 0.1%
5001357
7.6%
50114
 
0.1%
5258
 
< 0.1%
ValueCountFrequency (%)
102001
 
< 0.1%
101004
 
< 0.1%
99003
 
< 0.1%
98003
 
< 0.1%
83001
 
< 0.1%
79001
 
< 0.1%
78001
 
< 0.1%
63001
 
< 0.1%
58001
 
< 0.1%
520020
0.1%

risk_score
Real number (ℝ≥0)

Distinct1411
Distinct (%)7.9%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean61086.30221
Minimum2100
Maximum99750
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size140.0 KiB

Quantile statistics

Minimum: 2100
5-th percentile: 35450
Q1: 49350
median: 61200
Q3: 72750
95-th percentile: 85700
Maximum: 99750
Range: 97650
Interquartile range (IQR): 23400

Descriptive statistics

Standard deviation15394.25502
Coefficient of variation (CV)0.2520082975
Kurtosis-0.6774669325
Mean61086.30221
Median Absolute Deviation (MAD)11700
Skewness-0.02239834722
Sum1093933500
Variance236983087.6
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
6030041
 
0.2%
5280038
 
0.2%
6285038
 
0.2%
7275037
 
0.2%
6810037
 
0.2%
7245037
 
0.2%
4935035
 
0.2%
3810035
 
0.2%
5925034
 
0.2%
6795034
 
0.2%
Other values (1401)17542
98.0%
ValueCountFrequency (%)
21001
< 0.1%
22501
< 0.1%
28001
< 0.1%
44501
< 0.1%
61001
< 0.1%
111001
< 0.1%
118501
< 0.1%
137501
< 0.1%
155001
< 0.1%
158501
< 0.1%
ValueCountFrequency (%)
997501
< 0.1%
996001
< 0.1%
995501
< 0.1%
994501
< 0.1%
993001
< 0.1%
992001
< 0.1%
991501
< 0.1%
990001
< 0.1%
989501
< 0.1%
989001
< 0.1%

risk_score_2
Real number (ℝ≥0)

Distinct17475
Distinct (%)97.6%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.6908776181
Minimum0.023258235
Maximum0.999997479
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size140.0 KiB

Quantile statistics

Minimum: 0.023258235
5-th percentile: 0.5328652102
Q1: 0.6409930675
median: 0.6995605045
Q3: 0.752886555
95-th percentile: 0.8201192014
Maximum: 0.999997479
Range: 0.976739244
Interquartile range (IQR): 0.1118934875

Descriptive statistics

Standard deviation0.09047039328
Coefficient of variation (CV)0.1309499554
Kurtosis2.197558626
Mean0.6908776181
Median Absolute Deviation (MAD)0.055833193
Skewness-0.896817751
Sum12372.23639
Variance0.008184892061
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.7120571433
 
< 0.1%
0.7387537823
 
< 0.1%
0.7263319333
 
< 0.1%
0.6322705883
 
< 0.1%
0.7672168073
 
< 0.1%
0.6825521013
 
< 0.1%
0.7478974793
 
< 0.1%
0.700473953
 
< 0.1%
0.6539899163
 
< 0.1%
0.7219050422
 
< 0.1%
Other values (17465)17879
99.8%
ValueCountFrequency (%)
0.0232582351
< 0.1%
0.0499261341
< 0.1%
0.0646698321
< 0.1%
0.11281
< 0.1%
0.1139344541
< 0.1%
0.1348016811
< 0.1%
0.1663050421
< 0.1%
0.1744942021
< 0.1%
0.1907697481
< 0.1%
0.2019403361
< 0.1%
ValueCountFrequency (%)
0.9999974791
< 0.1%
0.9999478991
< 0.1%
0.9997974791
< 0.1%
0.9880865551
< 0.1%
0.9771210081
< 0.1%
0.9747899161
< 0.1%
0.9730588241
< 0.1%
0.9502521011
< 0.1%
0.9495798321
< 0.1%
0.9473613451
< 0.1%

risk_score_3
Real number (ℝ≥0)

HIGH CORRELATION

Distinct3945
Distinct (%)22.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.8782757628
Minimum0.451371431
Maximum0.999023613
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size140.0 KiB

Quantile statistics

Minimum: 0.451371431
5-th percentile: 0.7711775
Q1: 0.850882115
median: 0.88100422
Q3: 0.912607739
95-th percentile: 0.964098133
Maximum: 0.999023613
Range: 0.547652182
Interquartile range (IQR): 0.061725624

Descriptive statistics

Standard deviation0.05456319219
Coefficient of variation (CV)0.06212535345
Kurtosis0.9497467675
Mean0.8782757628
Median Absolute Deviation (MAD)0.030346561
Skewness-0.5879679624
Sum15728.16236
Variance0.002977141942
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.88118378527
 
0.2%
0.90122777926
 
0.1%
0.86754803425
 
0.1%
0.88112767125
 
0.1%
0.88110522525
 
0.1%
0.88104911125
 
0.1%
0.88111644824
 
0.1%
0.86753681124
 
0.1%
0.88119500824
 
0.1%
0.88113889423
 
0.1%
Other values (3935)17660
98.6%
ValueCountFrequency (%)
0.4513714311
< 0.1%
0.500751931
< 0.1%
0.5297180821
< 0.1%
0.5364742321
< 0.1%
0.5545430061
< 0.1%
0.6015442631
< 0.1%
0.6274465791
< 0.1%
0.634124171
< 0.1%
0.6421148321
< 0.1%
0.645425571
< 0.1%
ValueCountFrequency (%)
0.9990236131
< 0.1%
0.999012391
< 0.1%
0.998933831
< 0.1%
0.9988328251
< 0.1%
0.9978901061
< 0.1%
0.9978452151
< 0.1%
0.9978339921
< 0.1%
0.9977778781
< 0.1%
0.9977554322
< 0.1%
0.9977442091
< 0.1%

risk_score_4
Real number (ℝ≥0)

HIGH CORRELATION

Distinct17628
Distinct (%)98.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.583154562
Minimum0.016724453
Maximum0.978932031
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size140.0 KiB

Quantile statistics

Minimum: 0.016724453
5-th percentile: 0.3693354299
Q1: 0.5002080077
median: 0.5882078125
Q3: 0.672394922
95-th percentile: 0.778850781
Maximum: 0.978932031
Range: 0.962207578
Interquartile range (IQR): 0.1721869143

Descriptive statistics

Standard deviation0.1250612824
Coefficient of variation (CV)0.2144564933
Kurtosis-0.01810649831
Mean0.583154562
Median Absolute Deviation (MAD)0.086387109
Skewness-0.2697301031
Sum10443.1319
Variance0.01564032437
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.5969210943
 
< 0.1%
0.6209007813
 
< 0.1%
0.6216789063
 
< 0.1%
0.6329281253
 
< 0.1%
0.5769898443
 
< 0.1%
0.5111710943
 
< 0.1%
0.6131007812
 
< 0.1%
0.5686226562
 
< 0.1%
0.5947960942
 
< 0.1%
0.6126093752
 
< 0.1%
Other values (17618)17882
99.9%
ValueCountFrequency (%)
0.0167244531
< 0.1%
0.0196223831
< 0.1%
0.0834350781
< 0.1%
0.088818751
< 0.1%
0.0894046881
< 0.1%
0.0990600781
< 0.1%
0.1140078121
< 0.1%
0.1309953131
< 0.1%
0.1341523441
< 0.1%
0.136851
< 0.1%
ValueCountFrequency (%)
0.9789320311
< 0.1%
0.9768031251
< 0.1%
0.9494210941
< 0.1%
0.9468539061
< 0.1%
0.9399390631
< 0.1%
0.9380359381
< 0.1%
0.9289453121
< 0.1%
0.9282671881
< 0.1%
0.9191304691
< 0.1%
0.9163984371
< 0.1%

risk_score_5
Real number (ℝ≥0)

HIGH CORRELATION

Distinct17597
Distinct (%)98.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.7182519798
Minimum0.153367295
Maximum0.996259854
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size140.0 KiB

Quantile statistics

Minimum: 0.153367295
5-th percentile: 0.5103716527
Q1: 0.633708075
median: 0.7251125425
Q3: 0.8066806225
95-th percentile: 0.9081859225
Maximum: 0.996259854
Range: 0.842892559
Interquartile range (IQR): 0.1729725475

Descriptive statistics

Standard deviation0.120697337
Coefficient of variation (CV)0.1680431665
Kurtosis-0.3402806097
Mean0.7182519798
Median Absolute Deviation (MAD)0.086148283
Skewness-0.2497580737
Sum12862.45645
Variance0.01456784717
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.7440254793
 
< 0.1%
0.8276657842
 
< 0.1%
0.6744644132
 
< 0.1%
0.8446200862
 
< 0.1%
0.5898409392
 
< 0.1%
0.7366932182
 
< 0.1%
0.8276204342
 
< 0.1%
0.6225267852
 
< 0.1%
0.6178163062
 
< 0.1%
0.7325671642
 
< 0.1%
Other values (17587)17887
99.9%
ValueCountFrequency (%)
0.1533672951
< 0.1%
0.1609148851
< 0.1%
0.1711345621
< 0.1%
0.1903692571
< 0.1%
0.2144735341
< 0.1%
0.2147701111
< 0.1%
0.2530267291
< 0.1%
0.2585004681
< 0.1%
0.3143895931
< 0.1%
0.3341462231
< 0.1%
ValueCountFrequency (%)
0.9962598541
< 0.1%
0.9933634021
< 0.1%
0.9930023081
< 0.1%
0.9927986581
< 0.1%
0.9927122351
< 0.1%
0.9926001421
< 0.1%
0.9917393351
< 0.1%
0.9911583331
< 0.1%
0.9894042041
< 0.1%
0.9891072851
< 0.1%

ext_quality_score
Real number (ℝ≥0)

HIGH CORRELATION

Distinct17463
Distinct (%)97.5%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.6231124359
Minimum0.010184
Maximum0.970249
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size140.0 KiB

Quantile statistics

Minimum: 0.010184
5-th percentile: 0.3972336
Q1: 0.521735
median: 0.625944
Q3: 0.72984075
95-th percentile: 0.843793
Maximum: 0.970249
Range: 0.960065
Interquartile range (IQR): 0.20810575

Descriptive statistics

Standard deviation0.13972853
Coefficient of variation (CV)0.2242428845
Kurtosis-0.2640552949
Mean0.6231124359
Median Absolute Deviation (MAD)0.10402
Skewness-0.1991149062
Sum11158.6975
Variance0.01952406209
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.5808156
 
< 0.1%
0.824014
 
< 0.1%
0.5819433
 
< 0.1%
0.624013
 
< 0.1%
0.403583
 
< 0.1%
0.3846713
 
< 0.1%
0.6636223
 
< 0.1%
0.7913643
 
< 0.1%
0.6573883
 
< 0.1%
0.5982133
 
< 0.1%
Other values (17453)17874
99.8%
ValueCountFrequency (%)
0.0101842
< 0.1%
0.0128411
< 0.1%
0.0173391
< 0.1%
0.0220571
< 0.1%
0.0259761
< 0.1%
0.0266951
< 0.1%
0.0352181
< 0.1%
0.0454581
< 0.1%
0.0504271
< 0.1%
0.0553581
< 0.1%
ValueCountFrequency (%)
0.9702491
< 0.1%
0.9669531
< 0.1%
0.9636471
< 0.1%
0.9630751
< 0.1%
0.9580131
< 0.1%
0.9564211
< 0.1%
0.9560271
< 0.1%
0.9547321
< 0.1%
0.9536181
< 0.1%
0.9506821
< 0.1%

ext_quality_score_2
Real number (ℝ≥0)

HIGH CORRELATION

Distinct17469
Distinct (%)97.5%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.6220682099
Minimum0.006622
Maximum0.966953
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size140.0 KiB

Quantile statistics

Minimum: 0.006622
5-th percentile: 0.39609975
Q1: 0.51967675
median: 0.6229735
Q3: 0.72894
95-th percentile: 0.84288655
Maximum: 0.966953
Range: 0.960331
Interquartile range (IQR): 0.20926325

Descriptive statistics

Standard deviation0.1398983016
Coefficient of variation (CV)0.2248922214
Kurtosis-0.2874343242
Mean0.6220682099
Median Absolute Deviation (MAD)0.1047615
Skewness-0.1750563576
Sum11139.9975
Variance0.01957153479
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.4808155
 
< 0.1%
0.8508585
 
< 0.1%
0.103585
 
< 0.1%
0.524014
 
< 0.1%
0.6677313
 
< 0.1%
0.751343
 
< 0.1%
0.7516593
 
< 0.1%
0.6374393
 
< 0.1%
0.5256893
 
< 0.1%
0.7400043
 
< 0.1%
Other values (17459)17871
99.8%
ValueCountFrequency (%)
0.0066221
< 0.1%
0.0101841
< 0.1%
0.0133461
< 0.1%
0.0139731
< 0.1%
0.0220572
< 0.1%
0.0266951
< 0.1%
0.0352181
< 0.1%
0.0455641
< 0.1%
0.0773321
< 0.1%
0.0936831
< 0.1%
ValueCountFrequency (%)
0.9669531
< 0.1%
0.9645591
< 0.1%
0.9636471
< 0.1%
0.9612441
< 0.1%
0.9607681
< 0.1%
0.9569651
< 0.1%
0.9567611
< 0.1%
0.9536011
< 0.1%
0.9529161
< 0.1%
0.9519381
< 0.1%

inquiries_last_month
Real number (ℝ≥0)

Distinct30
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean6.457225821
Minimum1
Maximum30
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size140.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 4
median: 6
Q3: 8
95-th percentile: 13
Maximum: 30
Range: 29
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation3.67309251
Coefficient of variation (CV)0.5688344518
Kurtosis5.67117982
Mean6.457225821
Median Absolute Deviation (MAD)2
Skewness1.916539
Sum115636
Variance13.49160859
MonotonicityNot monotonic
Histogram with fixed size bins (bins=30)
ValueCountFrequency (%)
62918
16.3%
52798
15.6%
42351
13.1%
31910
10.7%
71796
10.0%
81330
7.4%
21254
7.0%
9892
 
5.0%
10672
 
3.8%
11497
 
2.8%
Other values (20)1490
8.3%
ValueCountFrequency (%)
17
 
< 0.1%
21254
7.0%
31910
10.7%
42351
13.1%
52798
15.6%
62918
16.3%
71796
10.0%
81330
7.4%
9892
 
5.0%
10672
 
3.8%
ValueCountFrequency (%)
304
 
< 0.1%
295
 
< 0.1%
2811
 
0.1%
2717
 
0.1%
2616
 
0.1%
2511
 
0.1%
2420
0.1%
2317
 
0.1%
2232
0.2%
2144
0.2%

e_signed
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size140.0 KiB
1
9639 
0
8269 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters: 17908
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row1
2nd row0
3rd row0
4th row1
5th row0

Common Values

ValueCountFrequency (%)
19639
53.8%
08269
46.2%

Length

Histogram of lengths of the category

Category Frequency Plot

Most occurring characters

ValueCountFrequency (%)
19639
53.8%
08269
46.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number17908
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
19639
53.8%
08269
46.2%

Most occurring scripts

ValueCountFrequency (%)
Common17908
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
19639
53.8%
08269
46.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII17908
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
19639
53.8%
08269
46.2%

Interactions

Correlations

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
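
As a rough illustration of the calculation just described, the Python sketch below ranks both variables and then divides the covariance of the ranks by the product of their standard deviations. Giving tied values the average of their rank positions is an assumption made here; the report text does not say how ties are handled. The function names are illustrative and are not part of pandas-profiling.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the ranks divided by the product of their std. deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    # The 1/n factors of covariance and standard deviations cancel out.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)
```

For a monotonic but nonlinear pair such as x = 1..5 and y = x squared, the ranks coincide, so the sketch returns 1 up to floating-point rounding.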

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
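
The following Python sketch is one way to write that calculation out by hand; it is an illustration under the definition above, not the code the report itself runs. The name `pearson_r` is made up for this example.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their std. deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # The 1/n factors of covariance and standard deviations cancel out.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Toy data (not taken from the profiled dataset).
x = [1, 2, 4, 7]
y = [3, 1, 6, 8]
r_original = pearson_r(x, y)
# Shift and rescale each variable separately (positive scale factors).
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 2 for b in y])
```

Here `r_original` and `r_rescaled` agree up to floating-point rounding, which is the location and scale invariance mentioned above.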

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
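
A minimal Python sketch of the pair-counting formula as stated above follows. It counts tied pairs as neither concordant nor discordant, which is an assumption here; library implementations often apply a tie-corrected variant instead, so their results can differ when ties are present. The name `kendall_tau` is illustrative.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = total = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        total += 1
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move the same way
        elif direction < 0:
            discordant += 1  # the variables move in opposite ways
        # direction == 0 is a tie and counts toward neither group
    return (concordant - discordant) / total
```

For x = [1, 2, 3] and y = [1, 3, 2] there are two concordant pairs and one discordant pair out of three, so the sketch gives 1/3.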

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
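
The sketch below shows one common way to write the bias-corrected estimator in Python, starting from a contingency table of counts. It assumes a plain chi-squared statistic with no continuity correction and a table with no empty rows or columns; whether the report applies any further adjustment is not stated in the text. The function name and the toy tables are illustrative only.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for an r x k table of counts (list of rows)."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Subtract the expected bias and clip at zero.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


# Toy tables (not taken from the profiled dataset).
perfect = cramers_v_corrected([[50, 0], [0, 50]])
independent = cramers_v_corrected([[25, 25], [25, 25]])
```

With these toy tables, `perfect` is 1 up to rounding and `independent` is 0, matching the two ends of the range described above.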

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available in the phik package documentation.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.

Sample

First rows

(index), entry_id, age, pay_schedule, home_owner, income, months_employed, years_employed, current_address_year, personal_account_m, personal_account_y, has_debt, amount_requested, risk_score, risk_score_2, risk_score_3, risk_score_4, risk_score_5, ext_quality_score, ext_quality_score_2, inquiries_last_month, e_signed
0, 7629673, 40, bi-weekly, 1, 3135, 0, 3, 3, 6, 2, 1, 550, 36200, 0.737398, 0.903517, 0.487712, 0.515977, 0.580918, 0.380918, 10, 1
1, 3560428, 61, weekly, 0, 3180, 0, 6, 3, 2, 7, 1, 600, 30150, 0.738510, 0.881027, 0.713423, 0.826402, 0.730720, 0.630720, 9, 0
2, 6934997, 23, weekly, 0, 1540, 6, 0, 0, 7, 1, 1, 450, 34550, 0.642993, 0.766554, 0.595018, 0.762284, 0.531712, 0.531712, 7, 0
3, 5682812, 40, bi-weekly, 0, 5230, 0, 6, 1, 2, 7, 1, 700, 42150, 0.665224, 0.960832, 0.767828, 0.778831, 0.792552, 0.592552, 8, 1
4, 5335819, 33, semi-monthly, 0, 3590, 0, 5, 2, 2, 8, 1, 1100, 53850, 0.617361, 0.857560, 0.613487, 0.665523, 0.744634, 0.744634, 12, 0
5, 8492423, 21, weekly, 0, 2303, 0, 5, 8, 2, 7, 1, 600, 74850, 0.677109, 0.758765, 0.495609, 0.664762, 0.592556, 0.492556, 6, 1
6, 7948313, 26, bi-weekly, 0, 2795, 0, 4, 4, 1, 6, 1, 800, 50800, 0.738055, 0.873204, 0.666437, 0.700392, 0.584130, 0.684130, 14, 1
7, 4297036, 43, bi-weekly, 0, 5000, 0, 2, 1, 1, 2, 1, 1100, 69100, 0.798303, 0.841747, 0.401971, 0.568787, 0.525905, 0.725905, 5, 1
8, 6493191, 32, semi-monthly, 0, 5260, 3, 0, 3, 1, 4, 1, 1150, 64050, 0.652429, 0.802433, 0.593816, 0.560389, 0.569459, 0.369459, 3, 1
9, 8908605, 51, bi-weekly, 1, 3055, 0, 6, 11, 4, 2, 1, 600, 59750, 0.624666, 0.968565, 0.509919, 0.749624, 0.758607, 0.758607, 5, 1

Last rows

(index), entry_id, age, pay_schedule, home_owner, income, months_employed, years_employed, current_address_year, personal_account_m, personal_account_y, has_debt, amount_requested, risk_score, risk_score_2, risk_score_3, risk_score_4, risk_score_5, ext_quality_score, ext_quality_score_2, inquiries_last_month, e_signed
17898, 2150976, 39, bi-weekly, 1, 5215, 0, 5, 2, 5, 3, 1, 600, 38200, 0.763789, 0.737341, 0.601577, 0.617802, 0.666175, 0.766175, 5, 0
17899, 6799343, 37, bi-weekly, 0, 3265, 0, 4, 1, 5, 2, 1, 1200, 67950, 0.715218, 0.911429, 0.606896, 0.790531, 0.431665, 0.531665, 2, 1
17900, 7100872, 31, weekly, 0, 3015, 0, 2, 1, 2, 2, 0, 450, 42450, 0.643778, 0.901396, 0.632284, 0.856231, 0.666399, 0.566399, 6, 1
17901, 1807355, 44, bi-weekly, 0, 5025, 6, 2, 3, 6, 3, 1, 500, 54500, 0.711895, 0.911418, 0.522509, 0.712864, 0.484913, 0.584913, 9, 0
17902, 3983229, 54, bi-weekly, 0, 2620, 5, 2, 1, 4, 2, 1, 600, 55450, 0.638183, 0.973020, 0.502234, 0.731239, 0.579557, 0.679557, 6, 0
17903, 9949728, 31, monthly, 0, 3245, 0, 5, 3, 2, 6, 1, 700, 71700, 0.691126, 0.928196, 0.664112, 0.838012, 0.727705, 0.627705, 2, 0
17904, 9442442, 46, bi-weekly, 0, 6525, 0, 2, 1, 3, 3, 1, 800, 51800, 0.648525, 0.970832, 0.699241, 0.844724, 0.774918, 0.474918, 3, 0
17905, 9857590, 46, weekly, 0, 2685, 0, 5, 1, 1, 8, 1, 1200, 59650, 0.677975, 0.918141, 0.687981, 0.939101, 0.472045, 0.672045, 9, 0
17906, 8708471, 42, bi-weekly, 0, 2515, 0, 3, 5, 6, 1, 1, 400, 80200, 0.642741, 0.885684, 0.456448, 0.686823, 0.406568, 0.406568, 3, 1
17907149855929weekly126650410411600649500.7208890.8743720.5055650.6316190.8461630.84616341